The goal of the AVA Challenge is to provide vision-based benchmarks and methods relevant to accessibility. In this paper, we present the technical details of our submission to the CVPR 2022 AVA Challenge. First, we conducted several experiments to help select an appropriate model and data augmentation strategy for this task. Second, an effective training strategy was adopted to improve performance. Third, we integrated the results of two different segmentation frameworks to further improve performance. Experimental results show that our approach achieves competitive results on the AVA test set. Finally, our method achieves 63.008% AP@0.50:0.95 on the test set of the CVPR 2022 AVA Challenge.
translated by Google Translate
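The abstract does not specify how the results of the two segmentation frameworks are integrated; one common, minimal scheme is a weighted average of per-pixel class probabilities followed by an argmax. The function name and the `weight_a` parameter below are illustrative, not from the paper:

```python
import numpy as np

def ensemble_probs(probs_a, probs_b, weight_a=0.5):
    """Fuse per-pixel class probabilities from two segmentation models.

    probs_a, probs_b: arrays of shape (num_classes, H, W), each summing
    to 1 over the class axis; weight_a balances the two frameworks.
    Returns the per-pixel class prediction of the fused distribution.
    """
    fused = weight_a * probs_a + (1.0 - weight_a) * probs_b
    return fused.argmax(axis=0)
```

In practice the weight would be tuned on a validation split; more elaborate fusions (e.g., per-class weights) follow the same pattern.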
In recent years, the thermal image super-resolution (TISR) problem has become an attractive research topic. TISR can be applied in various fields, including the military, medicine, agriculture, and animal ecology. Thanks to the success of the PBVS-2020 and PBVS-2021 workshop challenges, TISR results have kept improving and have attracted more researchers to register for the PBVS-2022 challenge. In this paper, we present the technical details of our submission to the PBVS-2022 challenge, which designs a Bilateral Network with Channel Splitting Network and Transformer (BN-CSNT) to solve the TISR problem. First, we designed a context branch based on a channel splitting network with a transformer to obtain sufficient context information. Second, we designed a spatial branch with a shallow transformer to extract low-level features that preserve spatial information. Finally, for the context branch, we propose an attention refinement module to fuse the features of the channel splitting network and the transformer, and the features of the context branch and the spatial branch are then fused through the proposed feature fusion module. The proposed method achieves PSNR = 33.64, SSIM = 0.9263 for x4, and PSNR = 21.08, SSIM = 0.7803 for x2 on the PBVS-2022 challenge test dataset.
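The PSNR figures quoted above follow the standard definition; for reference, a minimal implementation:

```python
import numpy as np

def psnr(pred, target, max_val=255.0):
    """Peak signal-to-noise ratio between two images (higher is better):
    10 * log10(max_val^2 / MSE)."""
    mse = np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For super-resolution benchmarks the metric is often computed on the luminance (Y) channel only; the challenge's exact evaluation protocol should be consulted for that detail.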
The goal of the referring video object segmentation (RVOS) task is to segment the object instance referred to by a language expression in a given video, across all video frames. Because it requires understanding cross-modal semantics for individual instances, this task is more challenging than traditional semi-supervised video object segmentation, in which the ground-truth object masks are given in the first frame. With the great achievements of transformers in object detection and object segmentation, RVOS has made remarkable progress, with ReferFormer achieving state-of-the-art performance. In this work, based on the strong baseline framework ReferFormer, we propose several tricks for further improvement, including a cyclical learning rate, semi-supervised learning, and test-time augmentation at inference. The improved ReferFormer ranked second in the Referring YouTube-VOS Challenge at CVPR 2022.
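The abstract names a cyclical learning rate among its tricks but gives no schedule; a common choice is the triangular schedule of Smith (2017), sketched here with illustrative parameter values:

```python
def triangular_clr(step, base_lr, max_lr, step_size):
    """Triangular cyclical learning rate (Smith, 2017): the LR ramps
    linearly from base_lr up to max_lr and back down, one full cycle
    every 2 * step_size optimization steps."""
    cycle_pos = step % (2 * step_size)
    x = abs(cycle_pos / step_size - 1.0)  # goes 1 -> 0 -> 1 over a cycle
    return base_lr + (max_lr - base_lr) * (1.0 - x)
```

Whether the authors used this exact variant (or, e.g., a decaying-amplitude one) is not stated in the abstract.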
The Visual Inductive Priors for Data-Efficient Computer Vision Challenge requires competitors to train models from scratch in a data-deficient setting. In this paper, we present the technical details of our submission to the ICCV 2021 ViPriors Instance Segmentation Challenge. First, we designed an effective data augmentation method to alleviate the data-deficiency problem. Second, we conducted several experiments to select an appropriate model and made some task-specific improvements. Third, we propose an effective training strategy that improves performance. Experimental results show that our approach achieves competitive results on the test set. In accordance with the competition rules, we do not use any external image or video data or pre-trained weights. The implementation details are described in Sections 2 and 3. Finally, our method achieves 40.2% AP@0.50:0.95 on the test set of the ICCV 2021 ViPriors Instance Segmentation Challenge.
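The submission's exact augmentations are not described in the abstract. As an illustration of the general pattern in instance segmentation, where the image and its mask must be transformed jointly, here is a minimal flip-and-rescale sketch (all names and scale values are hypothetical):

```python
import numpy as np

def random_flip_scale(image, mask, rng, scales=(0.8, 1.0, 1.2)):
    """Jointly flip and rescale an image and its label mask; a generic
    sketch of image/mask augmentation for data-deficient training, not
    the submission's actual recipe."""
    if rng.random() < 0.5:  # random horizontal flip
        image = image[:, ::-1].copy()
        mask = mask[:, ::-1].copy()
    s = rng.choice(scales)  # random scale jitter
    h, w = mask.shape
    new_h, new_w = int(h * s), int(w * s)
    # nearest-neighbour resampling keeps mask labels discrete
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return image[rows][:, cols], mask[rows][:, cols]
```

Real pipelines typically add color jitter, crops, and copy-paste of instances, but the joint image/mask handling shown here is the invariant part.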
Person re-identification (Re-ID) aims to identify the same person across non-overlapping capture cameras, and plays an important role in visual surveillance applications and computer vision research. Because annotating the identities of unlabeled data is highly expensive, fitting an appearance-based representation extractor to the limited collected training data is crucial for person Re-ID. In this work, we propose a stronger baseline for person Re-ID: an enhanced version of the currently prevailing method, the strong baseline, with only minor modifications but a faster convergence rate and higher recognition performance. With the help of the stronger baseline, we obtained third place (i.e., 0.94 in mAP) in the ViPriors 2021 Re-identification Challenge, without parameter initialization from ImageNet-based pre-training and without the aid of any extra supplementary datasets.
Video scene parsing in the wild, with its diverse scenarios, is a challenging and important task, especially with the rapid development of autonomous driving technology. The Video Scene Parsing in the Wild (VSPW) dataset contains well-trimmed, long-temporal, densely annotated, high-resolution clips. Based on VSPW, we design a Temporal Bilateral Network with a Vision Transformer. We first design a spatial path with convolutions to generate low-level features that preserve spatial information. Meanwhile, a context path with a vision transformer is employed to obtain sufficient context information. Furthermore, a temporal context module is designed to exploit inter-frame contextual information. Finally, the proposed method achieves a mean Intersection over Union (mIoU) of 49.85% on the VSPW 2021 Challenge test dataset.
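The mIoU metric quoted above is the standard semantic segmentation measure; for reference, a minimal implementation:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union, averaged over the classes that
    appear in either the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```

Benchmark implementations usually accumulate intersections and unions over the whole test set before dividing, rather than averaging per-image scores.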
Decompilation aims to transform a low-level program language (LPL) (e.g., binary code) into its functionally-equivalent high-level program language (HPL) (e.g., C/C++). It is a core technology in software security, especially in vulnerability discovery and malware analysis. In recent years, with the successful application of neural machine translation (NMT) models in natural language processing (NLP), researchers have tried to build neural decompilers by borrowing the idea of NMT. They formulate the decompilation process as a translation problem between LPL and HPL, aiming to reduce the human cost required to develop decompilation tools and improve their generalizability. However, state-of-the-art learning-based decompilers do not cope well with compiler-optimized binaries. Since real-world binaries are mostly compiler-optimized, decompilers that do not consider optimized binaries have limited practical significance. In this paper, we propose a novel learning-based approach named NeurDP that targets compiler-optimized binaries. NeurDP uses a graph neural network (GNN) model to convert LPL to an intermediate representation (IR), which bridges the gap between source code and optimized binary. We also design an Optimized Translation Unit (OTU) to split functions into smaller code fragments for better translation performance. Evaluation results on datasets containing various types of statements show that NeurDP can decompile optimized binaries with 45.21% higher accuracy than state-of-the-art neural decompilation frameworks.
Nearest-Neighbor (NN) classification has been proven to be a simple and effective approach for few-shot learning. The query data can be classified efficiently by finding the nearest support class based on features extracted by pretrained deep models. However, NN-based methods are sensitive to the data distribution and may produce false predictions if the samples in the support set happen to lie around the distribution boundary of different classes. To solve this issue, we present P3DC-Shot, an improved nearest-neighbor based few-shot classification method empowered by prior-driven data calibration. Inspired by the distribution calibration technique, which utilizes the distribution or statistics of the base classes to calibrate the data for few-shot tasks, we propose a novel discrete data calibration operation that is more suitable for NN-based few-shot classification. Specifically, we treat the prototypes representing each base class as priors and calibrate each support sample based on its similarity to the different base prototypes. Then, we perform NN classification using these discretely calibrated support data. Results from extensive experiments on various datasets show that our efficient non-learning based method can outperform, or is at least comparable to, SOTA methods that require additional learning steps.
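As a rough sketch of the calibration idea: each support feature is pulled toward the base-class prototypes it resembles. Note that the actual P3DC-Shot operation weights multiple base prototypes by similarity; this simplified version uses only the single nearest prototype, and all names and the `alpha` value are illustrative:

```python
import numpy as np

def calibrate_support(support, base_prototypes, alpha=0.7):
    """Shift each support feature toward its most similar base prototype.

    support: (n, d) support features; base_prototypes: (m, d) mean
    features of the base classes, used as priors. alpha controls how
    much of the original support feature is retained.
    """
    # cosine similarity between each support sample and each prototype
    s = support / np.linalg.norm(support, axis=1, keepdims=True)
    p = base_prototypes / np.linalg.norm(base_prototypes, axis=1, keepdims=True)
    sim = s @ p.T                                   # (n, m)
    nearest = base_prototypes[sim.argmax(axis=1)]   # (n, d)
    return alpha * support + (1.0 - alpha) * nearest
```

NN classification then proceeds as usual on the calibrated support features.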
In recent years, arbitrary image style transfer has attracted more and more attention. Given a pair of content and style images, the stylized result is expected to retain the content of the former while capturing the style patterns of the latter. However, it is difficult to maintain a good trade-off between content details and style features. When an image is stylized with abundant style patterns, the content details may be damaged, and sometimes the objects in the image cannot be clearly distinguished. For this reason, we present a new transformer-based method named STT for image style transfer, together with an edge loss that noticeably enhances the content details and avoids the blurred results caused by excessive rendering of style features. Qualitative and quantitative experiments demonstrate that STT achieves performance comparable to state-of-the-art image style transfer methods while alleviating the content leak problem.
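The abstract does not define STT's edge loss. A minimal sketch of an edge-preserving loss, under the assumption that Sobel edge magnitudes of the stylized and content images are compared (the function names and plain-loop convolution are illustrative only):

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def _conv2d(img, kernel):
    """'Valid' 2-D correlation of a grayscale image with a 3x3 kernel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

def edge_loss(stylized, content):
    """Mean absolute difference between Sobel edge magnitudes, so the
    stylized image is penalized for losing the content image's edges."""
    def edges(img):
        return np.hypot(_conv2d(img, SOBEL_X), _conv2d(img, SOBEL_Y))
    return np.mean(np.abs(edges(stylized) - edges(content)))
```

In a training framework the same idea would be expressed with differentiable convolutions so the loss can back-propagate into the generator.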
In contrast to control-theoretic methods, the lack of a stability guarantee remains a significant problem for model-free reinforcement learning (RL) methods. Jointly learning a policy and a Lyapunov function has recently become a promising approach to ensuring that the whole system carries a stability guarantee. However, the classical Lyapunov constraints that researchers have introduced cannot stabilize the system during sampling-based optimization. Therefore, we propose Adaptive Stability Certification (ASC), which makes the system reach sampling-based stability. Because the ASC condition can search for the optimal policy heuristically, we design the Adaptive Lyapunov-based Actor-Critic (ALAC) algorithm based on it. Meanwhile, our algorithm avoids the optimization problem in current approaches where a variety of constraints are coupled into the objective. When evaluated on ten robotic tasks, our method achieves lower accumulated cost and fewer stability constraint violations than previous studies.
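For context, the classical mean-decrease Lyapunov condition that such methods impose is standard in the Lyapunov-based RL literature; the paper's ASC condition adapts it and is not reproduced here. A common form, assuming a learned Lyapunov candidate $L_\pi$ and positive constants $c_1, c_2, c_3$:

```latex
% Classical Lyapunov conditions for the closed-loop system under policy \pi
% (standard form from the Lyapunov-based RL literature, not the paper's ASC).
c_1 \lVert s \rVert^2 \le L_\pi(s) \le c_2 \lVert s \rVert^2,
\qquad
\mathbb{E}_{s' \sim P_\pi(\cdot \mid s)}\big[ L_\pi(s') \big] - L_\pi(s)
\le - c_3 \, L_\pi(s).
```

The second inequality is the constraint that must hold under sampled transitions, which is exactly where, per the abstract, the classical formulation struggles during sampling-based optimization.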